Consistency of Random Forests
نویسندگان
چکیده
Random forests are a learning algorithm proposed by Breiman (2001) which combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical performance, little is known about the mathematical properties of the procedure. This disparity between theory and practice originates in the difficulty to simultaneously analyze both the randomization process and the highly data-dependent tree structure. In the present paper, we take a step forward in forest exploration by proving a consistency result for Breiman’s (2001) original algorithm in the context of additive regression models. Our analysis also sheds an interesting light on how random forests can nicely adapt to sparsity in high-dimensional settings. Index Terms — Random forests, randomization, consistency, additive model, sparsity, dimension reduction. 2010 Mathematics Subject Classification: 62G05, 62G20.
منابع مشابه
Bernoulli Random Forests: Closing the Gap between Theoretical Consistency and Empirical Soundness
Random forests are one type of the most effective ensemble learning methods. In spite of their sound empirical performance, the study on their theoretical properties has been left far behind. Recently, several random forests variants with nice theoretical basis have been proposed, but they all suffer from poor empirical performance. In this paper, we propose a Bernoulli random forests model (BR...
متن کاملConsistency of Online Random Forests
As a testament to their success, the theory of random forests has long been outpaced by their application in practice. In this paper, we take a step towards narrowing this gap by providing a consistency result for online random forests.
متن کاملOn the asymptotics of random forests
The last decade has witnessed a growing interest in random forest models which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains largely unexplained, thereby creating a gap between theory and practice. The aim of this paper is twofold. Firstly, we provide theoretical guarantees to link ...
متن کاملConsistency of Random Survival Forests.
We prove uniform consistency of Random Survival Forests (RSF), a newly introduced forest ensemble learner for analysis of right-censored survival data. Consistency is proven under general splitting rules, bootstrapping, and random selection of variables-that is, under true implementation of the methodology. Under this setting we show that the forest ensemble survival function converges uniforml...
متن کاملSparsity and Inverse Problems in Statistical Theory and Econometrics
In the last years of his life, Leo Breiman promoted random forests for use in classification. He suggested using averaging as a means of obtaining good discrimination rules. The base classifiers used for averaging are simple and randomized, often based on random samples from the data. He left a few questions unanswered regarding the consistency of such rules. In this talk, we give a number of t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014